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Tins  tutorial  paper  describes  the  maximum  entropy  spectrum  and  the 
Burg  technique  for  computing  the  prediction  error  power  and  prediction  error 
filter  coefficients  in  the  associated  spectral  estimation  formula.  The  maxi- 
mum entropy  spectrum  is  identical  to  the  autoregressive  spectral  estimator. 
Also  included  in  this  paper  is  a discussion  of  the  K-line  spectrum,  which  is 
the  wavenumber  analogue  of  the  frequency-domain  maximum  entropy  spec- 
trum, and  the  Burg  technique  modifications  necessary  for  its  implementatio 
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The  purpose  of  this  paper  is  to  provide  a complete  and  self- 
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formal  published  literature,  they  are  incorporated  in  this  paper.  Sup- 
porting material  and  various  sidelights  of  the  maximum  entropy  spectrum 
appear  in  the  appendices. 
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ABSTRACT 


This  tutorial  paper  describes  the  maximum  entropy  spectrum  and  the 
Burg  technique  for  computing  the  prediction  error  power  and  prediction  er- 
ror filter  coefficients  in  the  associated  spectral  estimation  formula.  The 
maximum  entropy  spectrum  is  identical  to  the  autoregressive  spectral  esti- 
mator. Also  included  in  this  paper  is  a discussion  of  the  K-line  spectrum, 
which  is  the  wavenumber  analogue  of  the  frequency-domain  maximum  entropy 
spectrum,  and  the  Burg  technique  modifications  necessary  for  its  implemen- 
tation. 

The  purpose  of  this  paper  is  to  provide  a complete  and  self-contained 
account  of  the  main  features  of  the  maximum  entropy  spectrum.  Since  many 
of  the  relevant  mathematical  derivations  are  not  found  in  the  formal  published 
literature,  they  are  incorporated  in  this  paper.  Supporting  material  and  var- 
ious sidelights  of  the  maximum  entropy  spectrum  appear  in  the  appendices. 
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The  maximum  entropy  spectrum  is  an  outgrowth  of  the  deconvolution 
filtering  technique  long  used  in  oil-exploration  data  processing.  A deconvolu- 
tion filter  is  a whitening  filter  whose  purpose  is  to  sharpen  the  images  of  a 
fieismic  profile.  John  Parker  Burg  in  the  early  1960's  noticed  that  high- 
resolution  power  density  spectra  could  be  computed  using  the  reciprocal  of  the 
squared  amplitude  response  of  the  deconvolution  Biter.  This  form  of  spectral 
estimator  was  known  as  the  Markov  spectrum  and  is  identical  to  the  autore- 
gressive spectral  estimator  independently  developed  and  described  in  the  sta- 
tistical literature.  Later  Burg  recognized  that  the  Markov  spectrum  is  the 
maximum  entropy  spectrum  (Burg,  1967)  of  all  possible  power  spectra  agree- 
ing with  the  measured  autocorrelation  function  values.  In  addition.  Burg  de- 
veloped a method  (Burg,  1968)  for  directly  calculating  the  coefficients  of  the 
deconvolution  filter  (or  prediction  error  filter)  used  in  the  spectral  estimate. 
This  method  produces  more  accurate  spectra  and  minimizes  the  problem  of 
end  effects. 

Most  of  the  problems  encountered  in  conventional  spectral  estimation 
are  remedied  by  the  maximum  entropy  spectrum  and  the  Burg  technique  for 
calculating  the  prediction  error  filter  coefficients.  For  example,  the  inverse 
Fourier  transform  of  the  maximum  entropy  spectrum  agrees  with  the  measured 
autocorrelation  function  values,  whereas  this  is  true  in  conventional  spectral 
estimation  only  when  a rectangular  taper  is  applied  to  the  autocorrelation  func- 
tion. Again,  in  computing  conventional  spectral  estimates,  only  a rectangular 
taper  preserves  the  autocorrelation  function  intact.  The  maximum  entropy 


spectrum,  on  the  other  hand,  always  uses  the  autocorrelation  function  in  un- 
modified form.  The  Burg  technique,  furthermore,  guarantees  that  the  power 
spectrum  is  always  positive,  that  the  prediction  error  filter  is  minimum  phase 
and  that  the  autocorrelation  matrix  corresponding  to  the  prediction  error  filter 
coefficients  is  always  non-negative  definite.  With  conventional  spectral  esti- 
mates, a zero  extension  of  the  autocorrelation  function  is  implicitly  assumed. 
In  some  cases,  this  assumption  is  unreasonable  and  produces  negative  power 
in  the  spectral  estimate.  In  all  cases,  the  truncation  of  the  autocorrelation 
function  produces  lower  resolution  than  the  maximum  entropy  spectrum,  which 
achieves  its  increased  resolution  through  an  optimal  extension  of  the  autocor- 
relation function. 

The  purpose  of  this  survey  paper  is  to  discuss  the  important  features 
of  the  maximum  entropy  spectrum  and  the  Burg  technique  and  to  present  the 
relevant  derivations,  which  are  often  not  readily  available.  The  exposition  re- 
lies heavily  on  Burg's  first  two  published  papers.  Section  II  deals  with  the 
maximum  entropy  spectrum.  Section  III  with  the  Burg  technique,  and  Section 
IV  with  the  K-line  spectrum,  which  is  the  wavenumber  analogue  of  the  maxi- 
mum entropy  spectrum.  The  appendices  contain  supporting  material  obtained 
from  a number  of  sources,  which  are  referenced  for  the  reader's  benefit. 


SECTION  n 

THE  MAXIMUM  ENTROPY  SPECTRUM 


Given,  an  infinite -length  discrete  time  series  (possibly  complex)  with 

elements  (•  * • • x ^ j T*  ••••  ^ ^ » ^0  * ^1  * • • • » 1 * "^T  * • • • ) sam- 

pled at  the  time  interval  At,  the  associated  NXN  autocorrelation  matrix  is 


Xt+N-1 


r(0) 

r(l)  

r(N-l) 

r*(l) 

r(0)  

• 

• 

• 

r*(N-l) 

• 

r*(N-2).... 

r(0) 

where  the  vinculum  denotes  averaging  over  time,  the  asterisk  denotes  com- 
plex conjugate,  and  the  elements 

T 

...  lim  1 \ S * , . ^ 

" T_»  2T+1  / ^ XtXt+j  (l-N_j_N-l) 

t=-T 


of  the  matrix  are  the  autocorrelation  function  values  at  the  lags  jAt.  The 

U 

autocorrelation  matrix  is  Hermitian  (RN  = R^  , where  the  superscript  H 
denotes  conjugate  transpose);  Toplitz  (having  identical  elements  along  each 
diagonal),  and  non-negative  definite  (V  Hrnv  > 0 for  all  non-zero  N- com- 
ponent column  vectors  V). 


II- 1 


Of  all  possible  power  spectra  P(f)  whose  inverse  Fourier  trans- 


forms 

SW/2 

P(f)  el27rfjAt  df 

- W/2 

agree  with  the  available  autocorrelation  function  values  r(j)  from  j = 1-N  to 
j = N-l,  the  maximum  entropy  spectrum  (Burg,  1967)  is  the  spectrum  maxi- 
mizing the  integral 


-W/2 

1 log  P(f)  df  . 

-W/2 

which  is  proportional  to  the  entropy  of  a Gaussian  band-limited  time  series 
with  power  spectrum  P(f)  and  bandwidth  W = 1/At  corresponding  to  the 
sampling  interval  At.  Entropy  maximization  subject  to  the  specified  con- 
straints on  the  power  spectrum  occurs  when 


.W/2 

-W/2 


log  P(f)  df  - 


P(f) 


,i2’£ia,di 


r(j) 


reaches  its  maximum  value  through  the  proper  choice  of  the  power  spectrum 
P(f).  The  quantities  b.  (j  = 1-N,  ...»  -1,  0,  1,  ...»  N-l)  are  complex- 
valued Lagrangian  multipliers.  To  f atisfy  the  extremal  condition,  the  partial 
derivative  of  the  integrand  with  respect  to  the  power  spectrum  must  be  zero: 

log  P(f)  - ^ b.  P(f)  JZwii*x 

j=l-N 


0 - 


«P(f) 


11-2 


i. 


Plf) 


N-l 

E.  i2trfiAt 

V 

j=  l-N 


Thus  the  form  of  the  maximum  entropy  spectrum  is 


P(f)  = 


N-l  N-l 

E vi2,fj“  E v1 

j = l-N  j=  1 -N 


where  a = e . To  make  the  power  spectrum  real- valued  and  to 

* 

satisfy  the  conditions  r(-j)  = r (j),  the  Lagrangian  multiplier  b . must  be 

a 

the  complex  conjugate  of  the  multiplier  b.  whose  subscript  has  the  oppo- 
site sign. 

The  equation 
N-l 


h.  z 
J 


-J 


= 0 


j=  l-N 


* -1 

has  2(N-1)  roots.  If  a is  a root,  then  (z  ) is  also  a root:  if 


then 


Thus  the  number  of  roots  for  which  | zj>|  is  equal  to  the  number  of  roots  for 
which  |z|<|.  If  no  roots  lie  on  the  unit  circle,  there  are  N-l  roots  outside 
the  unit  circle  and  N-l  roots  inside  the  unit  circle.  The  denominator  of  the 
maximum  entropy  spectrum  can  then  be  written 


W T -1  -2 

p-  1 + a2Z  + a3Z  + 

N L 


+ aNZ 


-(N- 


i)V 


* 

1 4 z 
2 


+ 


2 
z + 


where  is  real.  All  of  the  roots  within  the  unit  circle  are  incorporated 
in  the  factor  at  the  left.  Thus  the  maximum  entropy  spectrum  is  equal  to 


where  the  N-component  column  vectors  V and  A are,  respectively, 


II-4 
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The  constraints  embodied  in  the  matrix  equation 


NlZ 

P 

W/2 

1 P(£)  VVHdf  = 

-W/2 

N 

W 

-W/2 

VV 


H 


df 


vhaahv 


~r(0) 

r(l) 

• r(N-l) 

r*(l) 

r(0).V 

. r(N-2) 

• 

♦ 

• 

• 

r*(N~l) 

! "... 
r*(N-2) 

• c(0) 

determine  and  the  components  a^»  aj^  °*  t^e  column  vector  A 

Multiplication  on  the  right  by  the  column  vector  A yields 


d( 

(VHA)(AHV) 


rna 


or 


—I 

i2ir  J 


zAHV 


dz  = 


= 


(■ 


dz  = 


-i2ff  z 
W 


df) 


! i 


1 

i 


where  the  contour -integral  path  runs  counterclockwise  along  the  unit  circle. 

-i2vfAt 

(Revere  a!  cf  the  clockwise  path  z = e corresponding  to  increasing 

frequex  cy  eliminates  the  minus  sign  in  the  expression  for  dz. ) The  j-th 
comf  jnent  of  *na  is 


Since  the  factor  V A contains  all  of  the  roots  within  the  unit  circle,  the 

pj 

other  factor  A V is  never  equal  to  zero  within  the  unit  circle.  For  all  com 
ponents  j from  2 to  N,  the  power  of  z is  non -negative  and  the  integrand 
contains  no  poles  within  the  unit  circle.  For  j = 1,  however,  there  is  a 
simple  pole  at  z = 0 whose  residue  is 


lim 

z~»0 


l + a,  a + a. 


2 

z + 


* IN- 
+ a z 
N 


1. 


Cauchy' s residue  theorem  implies,  therefore,  that  the  first  component  of 
Rj^A.  is  P^  and  that  the  remaining  components  from  the  second  through  the 
N-th  are  zero.  Thus  the  maximum  entropy  spectrum  is 


P(f)  = 


Vw 

vhaahv 


pnm 


N-l 


„ _i2rfj At 
j=l 


1 + J «j+1o 


where  P and  a_,  a..,  ...  , 


a are  solutions  of  the  matrix  equation 


RELATIONSHIP  BETWEEN  PREDICTION  FILTER 
AND  PREDICTION  ERROR  FILTER 


r(l)  r{0) 


r(N-2)  r(N-3)> 


r (N-3) 


t+  1 t+  2 . 


t+N-1 


_Vn-i 


*3  = xt-i  .? 


t+N-1 


the  design  equation  for  an  optimum  (N-l) -point-long  backward  prediction 

* * * 

filter  with  weights  (-a^  » - - • , -a^»  “®2  )»  as  s^0,wm  in  Figure U- 1(b). 

If  the  prediction  filter  output  is  subtracted  from  the  actual  value  xt>  the 
result  is  the  prediction  error,  which  can  be  produced  from  a filter  with 
unity  weight  at  zero  lag  and  the  negative  of  the  prediction  filter  weights  at 
their  respective  lags.  Figures  II-l(c)  and  II-l(d)  illustrate  the  forward  and 
backward  prediction  error  filters.  The  term  P^T  in  the  maximum  entropy 
soectrum  formula  is  the  mean  square  error  of  the  prediction  filter  and 
the  power  of  the  prediction  error  filter  output: 


[ V 


aNJ  r(°* 

r?(l)  r(0): 


r(N-l)  1 

r(N  -2)  a. 


* * 

r (N-i)  r (N-2)< 


Inverse  Fourier  transformation  of  the  maximum  entropy  spectrum 
produces  autocorrelation  function  estimates  at  lags  greater  than  N-l.  Addi 
tion  of  the  element  z = e *fNdt  ^ ^ y permits  the  (N+  1)  X 

(N-f  1)  autocorrelation  matrix  to  be  determined; 
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r (N-Z)*®*— 

■r(C) 

1 
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- — — — « 

r (N) 

r (N-l). 

r (1) 
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r(0) 

Multiplication  on  the  right  by  the  extended  prediction  error  filter  column 
vector  and  transformation  to  die  z-plane  yields 


where  the  contour  path  runs  counterclockwise  along  the  unit  circle.  The 

N- 1 H 

(N+  l)-st  component  is  zero  since  all  poles  of  the  integrand  z /(A  V)  lie 

* 

outside  the  unit  circle.  The  value  r (N)  can  be  determined  from  the  equa- 
tion for  the  (N+  1)  -r,t  component.  Repeated  application  of  this  procedure 

a * 

yields  the  values  r (N+l),  r (N+2),  etc.,  from  the  matrix  equation 


n-11 
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r(D 
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r (1) 
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r*(N) 

r*(N-l) 

r*(l) 
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Figure  II-2(b)  illustrates  how  the  backward  extension  of  the  autocorrelation 
function  is  accomplished  using  the  complex  conjugate  of  the  backward  pre- 
diction filter.  Figure  H-2(a)depicts  the  corresponding  forward  extension  with 
the  complex  conjugate  of  the  forward  prediction  filter. 


When  this  maximum  entropy  method  of  autocorrelation  function  extension 
determines  the  unavailable  lag  values,  the  resultant  power  spectrum 

P(f)  = i.  Vs  r(j)  * = 1 YV r(j)  e l2tf-»At 

W W * 

J=  '"®°  j=  -m» 


is  the  maximum  entropy  spectrum.  This  fact  can  be  verified  by  premulti- 

j! 

plying  the  power  spectrum  by  W (V  A)  and  applying  the  method  of  undetermined 


coefficients  to  the  coefficients  of  z 


1 


-(N-l)  -N 


-(N+  j) 


W(VHA)  P(f)  = vha 


N 

AHV 


Conventional  spectral  estimates  with  taper  functions  (Bartlett,  Hanning, 
Hamming,  Parzen,  etc.)  are  exactly  equal  to  the  convolution  of  the  maximum 
entropy  spectrum  with  the  frequency  window  corresponding  to  the  particular 
tapering  method  employed: 
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(b)  Backward  Extension 

FIGURE  n-2 

EXTENSION  OF  THE  AUTOCORRELATION  FUNCTION 
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ws 
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where  j and  k are  integers,  g is  the  Kronecker  delta  operator,  and  w.  is  the 

3 J 

taper  weighting  for  the  autocorrelation  function  r{j). 


The  equations 
N-l 

01  E v>r(j-k) 

j=0 


(for  Jill  positive  integers  k) 
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in  the  autocorrelation  function  extension  matrix  equation  imply  that 


N-l 

"N-l 

II 

o 

* 

a.  , x . x , = 

j+ 1 t- j t-k 

V a.  , x . 
Z- 1 J+1  t-j 

j=0 

* 

* , 
t-k 

or  that  the  cross  correlation  function  between  the  forward  prediction  error 
filter  output  and  the  time  series  values  x^  ^ before  the  filter  output  time 
is  zero  for  all  positive  values  of  k.  Similarly,  the  equations 


N-l 

N-l 

v * 

0=  / a.  , x x , = 

4—i  J+  1 t+j  t+k 

j=0 

L vi  x*+j 

j=  o 

x , 
t+k 

imply  that  the  crosscorrelation  function  between  the  backward  prediction 

error  filter  output  p and  the  time  series  values  x , after  the  filter  output 

t t+k 

time  is  also  zero  for  all  positive  values  of  k.  For  the  values  of  k from  1 to 
N-l,  this  result  is  a consequence  of  the  fact  that  the  (N-l) -point-long  predic- 
tion filter  is  optimum.  For  the  values  of  k greater  than  or  equal  to  N,  this 
result  is  a consequence  of  the  maximum  entropy  assumption.  If  this  assump- 
tion is  correct,  the  (N-l) -point-long  prediction  filter  is  also  the  infinitely  long 
optimum  prediction  filter,  and  the  predictability  of  x^  cannot  be  improved  by  using 
the  time  series  values  x at  lags  k whose  absolute  value  is  N or  greater. 

IT  K 

If  the  predictability  of  can  be  improved,  the  entropy  or  uncertainty  of  the 
time  series  x^  is  less  than 


Still 
• -W/2 


log 


PNAt 

vhaahv 


df. 


Since  the  forward  prediction  error  filter  output  q^  is  a linear  com- 
bination of  earlier  time  series  values  x^  and  the  backward  prediction  error 

filter  output  p is  a linear  combination  of  later  time  series  values  x , , the 
t t+k 

equations 
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imply  that  the  prediction  error  filter  output  has  a white  power  spectrum,  that 
the  filter  output  power  is  P^,  and  that  the  filter  output  power  density  is  P^/  W 
or  P^At.  Figure  II -3  portrays  the  relationship  between  the  input  power  spectrum 
and  the  white  prediction  irror  filter  output  power  spectrum  correspo  iding 
to  the  maximum  entropy  assumption.  The  maximum  entropy  assumption  that 
the  prediction  error  filter  output  is  white  is  equivalent  to  an  autoregression 
model  of  order  N-l  for  a second-order  zero-mean  stationary  time  series. 

The  maximum  entropy  spectrum  is  known  as  an  autoregressive  spectrum  in 
the  statistics  literature  (e.  g.  : Jones,  1974;  Gersh  and  Sharpe,  1973).  The 
assumed  whiteness  of  the  forward  and  backward  prediction  error  filter  outputs 
provides  a means  of  testing  the  maximum  entropy  assumption:  if  the  spectra  of 
the  prediction  error  filter  outputs  are  not  approximately  white,  the  maximum 
entropy  spectrum  is  not  a good  spectral  estimate.  This  fact  is  useful  in 
determining  the  best  length  for  the  prediction  error  filter.  More  formal 
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FIGURE  II -3 

ASSUMED  RELATIONSHIP  BETWEEN  PREDICTION  ERROR  FILTER 
OUTPUT  POWER  SPECTRUM  AND  INPUT  POWER  SPECTRUM 
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methods  of  identifying  the  order  of  the  autoregression  model  such  as  Akaike's 
final  prediction  error  (FPE)  criterion  (Akaike,  1969;  Akaike,  1970a;  Akaike, 
1970b;  Akaike,  1971a)  and  Akaike's  information  criterion  (AIC)  (Akaike,  1971b; 
Akaike,  1972a;  Akaike,  1972b;  Akaike,  1974)  provide  a means  of  objectifying 
the  maximum  entropy  spectrum.  These  methods  are  apparently  not  widely 
known  in  the  geophysical  literature.  Likewise,  methods  of  estimating  the 
spectral  mean  and  variance  are  available  in  the  statistical  literature  (Kromer, 
1969). 


Since  all  zeroes  of 


N-l 


«KV  ■ I . 


E v.  -1 


j=l 


lie  outside  the  unit  circle,  all  zeroes  of  the  forward  prediction  error  filter 
z-  transform 

N-l 

A(z)  . 1+  £ .Jtl  «* 

j=l 


also  lie  outside  the  unit  circle  because  the  roots  of  A(z)  are  simply  the  com- 

H 

plex  conjugates  of  the  roots  of  A V.  Thus  the  forward  prediction  error  filter 
is  a minimum  phase  filter.  Since  its  output  does  not  precede  any  of  its  input 
points,  it  is  also  a causal  filter.  The  z-transform 


C(z)  = 


1 

A(z) 


of  the  inverse  to  the  forward  prediction  error  filter  can  be  expanded  in  a 
power  series  with  no  negative  powers  of  z since  the  forward  prediction  error 
filter  has  no  zeroes  on  or  inside  the  unit  circle.  The  inverse  of  the  forward 


i 
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prediction  error  filter  is  also  a causal  minimum  phase  filter.  The  inverse 
filter  can  be  used  to  construct  the  time  series  x^: 


^ Ck+  1 Vk  E 

k=  0 


a . , x . ' 

j+1  t-j-k 


E 

m-  0 


[min(m,  N-l)  *; 

Cm-n-l  an+ 1 

n=  0 


x. 


t-m 


m 

-E 


since 


R x - x 
°om  t-m  t 


m=  0 


'■(£  wl- 1 

\ j=  0 / \k=0  / m=  0 


min(m,N-l) 

Cm-n-l  &n+  1 


n=  0 


m 


m 

E 

m=  0 


«om  Z 


m 


Thus  the  input  time  series  can  be  generated  from  the  forward  prediction  error 
filter  output  using  a causal  filter  (the  inverse  of  the  forward  prediction  error 
filter).  The  process  is  causally  invertible:  the  causal  forward  prediction  error 
filter  produces  the  forward  prediction  error  filter  output.  If  the  maximum 
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entropy  assumption  is  satisfied  (i.  e. , if  the  time  series  q^  is  white)*  then 
the  forward  prediction  error  filter  output  is  the  innovations  sequence  (Kailath, 
1970;  Kailath*  1968;  Kailath  and  Frost,  1968;  Frost  and  Kailath*  1971;  Kailath 
and  Geesey*  1971;  Kailath  and  Geesey*  1973;  Gevers  and  Kailath,  1973;  Aasnaes 
and  Kailath*  1973)  corresponding  to  the  time  series  xt  when  it  is  a second-order 
zero-mean  stationary  time  series.  Figure  II-4 diagrams  the  relationship  between 
the  innovations  sequence  and  the  input  time  series.  An  equally  apt  and  more 
succinct  name  for  a prediction  error  filter  is  an  innovations  filter  (as  shown 
in  Figure  II~4).The  backward  prediction  error  filter  is  a maximum  phase  filter 
(a  minimum  phase  filter  if  the  direction  of  time  is  reversed).  When  the  maxi- 
mum entropy  assumption  is  valid,  it  generates  a backward  innovations  sequence 
from  present  and  future  values  of  the  time  series  x^.  In  the  vast  majority  of 
practical  geophysical  processing  situations,  the  optimum  prediction  error 
filter  coefficients  tend  to  zero  as  the  length  of  the  filter  increases.  Conse- 
quently, the  prediction  error  filter  output  approaches  an  innovations  sequence 
as  the  filter  length  increases  indefinitely. 

To  solve  the  prediction  error  filter  design  matrix  equation,  a simplified 

form  of  a recursive  algorithm  first  developed  by  Norman  Levinson  (Levinson, 

1947)  and  later  presented  in  the  statistical  literature  (IXirbin,  I960)  is  available. 

The  (N+l) -point-long  prediction  error  filter  can  be  created  from  the  N-point- 

long  prediction  error  filter  through  the  determination  of  the  single  coefficient 
N+l 

in  the  (N+l) -point-long  filter: 
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Causal  Innovations  Filter 
Forward  Prediction  Error  Filter) 


Setting 

aN+l  = 'VPN 

N+l 

yields  the  prediction  filter  weights  (A  + a.., , P_„  = 0)  and  the  power  P , , 

lN+1  1%  N+l 

of  the  (N+l)-point-long  prediction  error  filter  output: 

PN  + aN+l  [•  PN  (aN+l)  j = PN  [*  " *N+!  (am!  * PN+1  . 

The  NXN  solution  proceeds  recursively  from  the  one-by-one  problem,  where 

H [ai  * •]  ■ pi  • 

N+l 

The  amplitude  of  the  coefficient  »N+j  must  not  exceed  one  if  the  (N+l)  X 
(N+l)  autocorrelation  matrix  is  non-negative  definite.  This  condition 

is  equivalent  to  specifying  that  the  predic.ion  error  filter's  z -transform  has 
no  zeroes  inside  the  unit  circle  (see  Appendix  A).  A survey  paper  (Kailath, 

1974)  on  linear  filtering  theory  discuKBes  some  of  the  implications  of  Levinson's 
algorithm. 
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because 
1 


/ N f / N\*  / NY* 

(a  2 / (a  3 j *•••—••  (aNf 

, / N-l\  / N-l\*| 

! ' 2 ) v N-l' 


r(0) 
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r(l) 

r(0) 


r(N-l) 

r(N-2) 


r*(N-l)  r*(N-2) *r(0) 
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N 


N 
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(The  zeroes  to  the  right  of  the  main  diagonal  in  the  last  matrix  occur  because 
the  matrix  is  Hermitian,  and  this  fact  is  true  since  the  matrix  to  the  left  of 
the  Hermitian  autocorrelation  matrix  is  the  conjugate  transpose  of  the  matrix 
to  its  right. ) 

If  w is  substituted  for  r(N)  in  the  (N+  1)  X(N+  1)  autocorrelation  matrix 

II  , , the  determinant  is 
N+  1 

I RN+  1 1 = lRN-l|(C+Wo  w + *)• 


^ i 

where  c is  a term  not  involving  w or  w and  R w is  the  coefficient  for  w . 

L I N-ll  ° 

Since  the  determinant  |R^  t J is  also  the  product  of  the  prediction  error  powers 


In  1 L,  I / 2 | | 2 

i i ^ 

. N+  1 / N+  1\* 

Pn+^Pn-iK'  -|w-wo|  , 

)=  |rn-i|  pn 

' 1 VW  1/ 
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N+  1 


The  determinant  is  maximized  when  a = 0 or,  equivalently,  when  w = w , 

N+ 1 o 


so  that  r = P,,  and 
N 


w 


N -1 


= - ]£  ( 1 j+  l)  * r(N-j>  • 


j=l 


As  shown  earlier,  r{N)  = w is  the  maximum  entropy  extension  of  the  auto- 

o 

correlation  function.  An  alternate  but  equivalent  entropy  definition  requires 


that  the  determinant  |R  be  maximized  by  the  proper  choice  of  r(N)  in 

I N+  1 1 

order  to  maximize  the  entropy  of  the  time  series  (McDonough,  1974). 

The  fact  that  the  maximum  entropy  autocorrelation  function  extension  de- 
scribed here  does  indeed  maximize  the  determinant  [R  T means  that  the 

r N+  1| 

two  entropy  definitions  are  consistent.  Since  the  autocorrelation  matrix  is 

non-negative  definite,  the  autocorrelation  function  value  r(N)  must  lie  on  or 

i0 

within  the  w-plane  circle  described  parametically  by  w^  + e P^.  The  maxi- 
mum entropy  autocorrelation  function  extension  places  the  value  r(N)  precisely 
at  the  center  of  the  circle.  Any  other  choice  of  r(N)  is  biased  in  the  sense  that 

it  adds  information  not  based  on  the  autocorrelation  matrix  R . Thus,  when 

N 

the  autocorrelation  matrix  is  precisely  known  and  nothing  is  known  about 
the  autocorrelation  function  at  lags  of  N or  greater,  the  maximum  entropy 
spectrum  is  the  most  reasonable  power  spectrum  possible. 
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SECTION  m 
THE  BURG  TECHNIQUE 


The  Burg  technique  (Burg,  1968)  is  a method  for  computing  the  pre- 
diction error  filter  coefficients  directly  from  a finite-length  time  series  x^ 

(t=  1,2,  ...  , T-l,  T).  By  using  the  Levinson  recursion  relationship  described 
earlier,  this  procedure  guarantees  that  the  forward  prediction  error  filter  is 
minimum  phase  and  that  the  associated  autocorrelation  matrix  is  non-negative 
definite.  Minimi  zing  the  average  power  of  both  the  forward  and  backward 
prediction  error  filter  outputs  to  obtain  the  filter  coefficients  provides  greater 
reliability  in  their  estimation.  In  this  algorithm,  no  assumptions  about  the 
time  series  before  or  after  the  measured  values  are  made,  so  that  end  effects 
are  avoided.  The  advantages  of  the  Burg  technique  are  parti cularly  pronounced 
when  the  number  of  points  in  the  time  series  x^  is  small. 

Since  the  coefficient  of  a one -point-long  prediction  error  filter  is  one, 
both  the  forward  and  backward  prediction  error  filter  outputs  at  time  t are 
equal  to  x^.  Computation  of  the  prediction  error  power  for  the  one -point-long 
filter  initializes  the  Burg  technique: 


The  autocorrelation  function  value  r(0)  is  equal  to  P^. 

Once  the  N-point-long  prediction  error  filter  outputs  and  coefficients 

are  known,  the  (N+  1)  -point-long  prediction  error  filter  is  formed  from  the  N- 

point-long  prediction  error  filter  through  the  determination  of  the  single  coeffi- 
N+  1 

cient  a..  , in  the  (N+  l)-point-long  filter.  With  the  aid  of  the  Levinson  recur- 
N+  1 

sion  relationship  between  the  N-point-long  and  (N+  1)  -point-long  prediction 


error  filters,  the  forward  and  backward  (N+  1)  -point-long  filter  outputs 
can  be  expressed  as  a linear  combination  of  the  N -point -long  filter  out- 
puts: 

N+l  N+l  N N 
qt+N  = aN+l  Pt  + qt+N 

N+l  N / N+ 1 \*  N 
Pt  s pt  + Pn+J  qt+N  . 

N N 

where  and  p^  are  the  outputs  at  time  t from  the  N -point-long  forward 

and  backward  prediction  error  filters,  respectively.  Figure  ni-1  illustrates 

this  relationship,  which  was  first  stated  explicitly  in  the  literature  by 

N+  1 

Andersen  (Andersen,  1974).  The  coefficient  a^+  ^ is  chosen  to  minimize 
the  average 


of  the  forward  and  backward  prediction  error  powers.  E^+  j i*  a minimum 

when  its  partial  derivatives  with  respect  to  the  real  and  imaginary  parts  of 
N+  1 

a..  are  zero: 
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FIGURE  IH-1 

FORMATION  OF  THE  (N+ 1) -POINT-LONG  PREDICTION  ERROR  FILTER 
OUTPUTS  FROM  THE  N-POINT-LONG  PREDICTION  ERROR  FILTER 
OUTPUTS  USING  THE  LEVINSON  RECURSION  RELATIONSHIP 
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Since  the  summation  in  the  numerator  is  the  inner  product  between  the  vectors 
(Pj.  P2>  •••  » PT_N^  and  (q  qN+2’  ***  * qt^  and  tbe  denominator  is  the 

sum  of  the  squared  magnitudes  of  these  two  vectors,  the  absolute  value  of 

N+  1 , V 1 * 1 <■  2 3 N+  1 

a.T  , never  exceeds  one.  If  the  absolute  values  of  a_,  a,,  ...  , a^T  . are 
N+l  2 3 N+ 1 

all  less  than  one,  then  the  forward  prediction  error  is  minimum  phase  (see 

N+  1 

Appendix  A).  Substitution  of  a , into  the  expression  for  E yields 
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The  (N+  1) -point-long  forward  prediction  error  filter  is  determined 
from  the  Levinson  recursion  relationship: 
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The  prediction  error  power  for  the  (N+  1) -point-long  filter  is  also  determined 
from  the  Levinson  recursive  relationship: 


N+ 


_ f,  N+  1 / N+  1 V * 
1 PN  [ aN+ 1 laN+ 1) 


This  power  is,  in  general,  not  equal  to  E . If  the  absolute  values  of  a , 
3 N+l  h+l  2 

1 jl  • • 

definite  since  the  successive  determinants 


, , a , are  all  less  than  one,  then  the  matrix  R,  , is  positive 
N+ 1 N+ l 


N-rk 

j=i 


(k  = 1,  2,  ...  , N+  1) 


are  all  positive.  If  desired,  the  corner  elements  r(N)  and  r (N)  in  the  matrix 
R^+  can  be  recursively  determined  from  the  bottom  row  of  the  (N+  l ) -point- 
long  prediction  error  filter  design  matrix  equation: 

r<N>  = (a j+V ) 

hi 


Thus  the  matrix  j can  be  determined  from  P^  and  the  successive  pre- 
diction error  filters. 

From  the  prediction  error  power  and  the  N -point-long  prediction 
error  filter  coefficients,  the  maximum  entropy  spectrum 


P(f)  = 


V' 


N=  1 

j=l 


N i27rfiAt 
a . . e 
J+l 
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corresponding  to  the  Burg  technique  can  be  calculated.  Empirical  comparisons 
of  the  maximum  entropy  spectrum  using  the  Burg  technique  with  the  maximum 
entropy  spectrum  using  estimated  autocorrelation  functions  (e.  g. , Radoski, 
Fougere,  and  Zawalick,  1974)  indicate  greater  resolution  and  spectral  accuracy 
is  possible  with  the  Burg  technique.  As  far  as  known,  however,  there  are  no 
definitive  statistical  studies  to  support  this  claim.  Furthermore,  there  is  no 
known  objective  method  for  determining  the  order  of  the  autoregression  model 
when  the  Burg  technique  is  used.  Akaike's  criteria  appear  to  be  inextricably 
linked  to  autocorrelation  functions  estimated  by  the  formula 

T-j 

r(j)  = V xx*. 

T-j  t t+j 

t=  1 

The  statistical  evaluation  of  maximum  entropy  spectra  using  the  Burg  technique, 
therefore,  appears  to  be  an  area  meriting  more  detailed  scrutiny.  At  the 
present  time,  the  choice  of  the  prediction  error  filter  length  for  the  Burg 
technique  is  a matter  of  subjective  judgment. 

Since  the  Burg  technique  computes  the  prediction  error  filter  coefficients 
and  filter  output  power  directly,  the  autocorrelation  function  is  not  needed  for 
the  maximum  entropy  spectrum.  In  fact,  the  autocorrelation  function  lags  0 to 
N-l  or  the  order  N-l  maximum  entropy  spectrum  or  the  N-pointlong  predic- 
tion error  filter  and  its  output  power  contain  equivalent  information.  Figure  TTT-2 
illustrates  the  relationships  between  these  three  functions.  A knowledge  of 
one  permits  the  other  two  to  be  determined.  Figure  III-2  presents  some,  but  by 
no  means  all, of  the  ways  to  accomplish  the  transformations  between  these 
three  functions. 

The  other  principal  use  for  autocorrelation  functions  is  in  time-domain 
digital  filter  design,  where  the  inverse  of  the  autocorrelation  matrix  is 
needed.  As  shown  earlier  in  this  paper,  however,  the  inverse  is 
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Maximum  Entropy  Spectrum 
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Prediction  Error  Power 
and  Prediction  Error  Filter  Coefficients 


I 


yt/z 

-W/2 


(a)  r(j)  = I P(f)  elZ7rfjAtdf 


(b) 


r(0) 

* 


r(l) 


r (1)  r(0)  v.V 

. . 

• * •*«« 

• • •••.. 

ak  * 


r*(N-l)  r*(N-Z) 


r(N-l) 

r(N-2) 

r(0) 


1 

PN 

N 

*2 

• 

0 

• 

• 

• 

•N 

•n 

• 

• 

• 

0 

(c)  P(f)  = 


PNAt 


N-l 


1 + 


EN  i 2 7rf j A t 

aj+  1 6 


j=l 


FIGURE  in- 2 

RELATIONSHIP  BETWEEN  AUTOCORRELATION  FUNCTION, 
PREDICTION  ERROR  FILTER  AND  MAXIMUM  ENTROPY  SPECTRUM 


SECTION  IV 

THE  K-LINE  SPECTRUM 


When  the  crosspower  spectrum  matrix  for  an  equally -spaced  line  array 
is  available,  there  is  a wavenumber  analogue  to  the  frequency -domain  maxi- 
mum entropy  spectrum  described  previously  in  this  paper.  It  is  known  as  the 
K-line  spectrum  and  was  developed  by  John  Parker  Burg  while  at  Texas 
Instruments. 

If  a space-time  wavefield  can  be  described  as  a superposition  of  plane 
waves,  then  the  inverse  Fourier  transform 

-W/2 

r(j>  =J  P(k)e  -i2*kjAxdk 

-W/2 


of  the  wavenumber  spectrum  P(k)  is  equal  to  the  crosspower  spectrum  for 
two  sensors  at  a spatial  displacement  of  jAx,  where  Ax  is  the  distance  be- 
tween two  successive  sensors  in  the  line  array.  This  fact  requires  that  all 
crosspower  spectra  corresponding  to  the  same  spatial  displacement  be 
identj  cal: 


r(j)  = <I>  = 

m,  m+ j 
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is  the  crosscorrelation  function  at  a time  lag  r between  the  output  g (t)  of 

' m 

the  m-th  sensor  and  the  output  g .(t)  of  the  (m+ j)-th  sensor,  and  where 

m+  j 

t denotes  the  time  of  the  sensor  output.  As  a result,  the  NXN  crosspower 
spectrum  matrix 
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for  an  N-element  array  with  sensor  coordinates  0,  Ax,  7.  A x ...  , (N-l)  Ax 
is  a ’foplitz  matrix  as  well  as  a Hermitian  non-negative  definite  matrix. 
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If  W = 1/Ax,  z=e  , and  the  vectors  V and  A are,  respectively. 
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the  wavenumber  maximum  entropy  spectrum  is 
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The  proof  is  very  similar  to  the  proof  for  the  frequency -do main  maximum 
entropy  spectrum  and  has  been  given  previously  (Barnard,  1969). 

In  tic  case  of  the  wavenumber  maximum  entropy  spectrum,  the  forward 

prediction  error  filter  with  weights  (1,  a^»  » a^)  is  a spatial  filter  whose 

output  is  the  error  in  estimating  the  N-th  sensor  from  the  first  N-l  sensors. 

Figure  IV- 1 illustrates  this  situation.  In  the  figure,  X^(f)  denotes  the  Fourier 

transform  of  the  j-th  sensor  output.  The  backward  prediction  error  filter 
★ * 

with  weights  (a  , ...  , a , 1)  outputs  the  error  in  estimating  the  first  sensor 

IN  Cm 

from  the  second  through  N-th  sensors.  The  term  in  the  filter  design  matrix 
equation  is  the  spatial  prediction  error  power  density  spectrum  and  is  equal 
to  both  q^q^  and  Pj  Pj  » where  the  vinculum  denotes  the  crosspower  spectrum 
( or  autopower  spectrum)  of  the  two  quantities  below  it. 


In  practice,  the  assumption  of  space-stationarity  is  not  satisified  and 
the  elements  along  any  diagonal  are  not  equal 


m5  m+  j n , n+  j 


(m=£  n). 


The  easiest  way  to  remedy  this  problem  is  to  average  the  elements  along  each 
diagonal  of  the  crosspower  spectrum  matrix.  The  ensuing  spectrum  is  the 
wavenumber  analogue  of  the  standard  autoregressive  spectrum.  If  this  pro- 
cedure is  performed,  however,  the  resulting  Toplitz  matrix  may  not  be  non- 
negative definite. 

Another  way  to  compute  the  wavenumber  maximum  entropy  spectrum 
is  to  use  a modified  version  of  the  Burg  technique.  The  procedure  begins  by 
averaging  the  autopower  spectra  to  estimate  the  1 -point-long  prediction  error 
power  density  spectrum: 


IV- 3 


X^f)  x2(f) 


XN-l(f)  X-(f) 


aN(f)  aN-l(f) 


■a2(f)  a^f)  = 1 


Forward  Prediction  Error  = = > a.  ,(f)  X^,  .1 

N j+ 1 N-j 

j=0 

(a)  Forward  Spatial  Prediction  Error  Filter 


X.,  ,(f)  X If) 


a,  (f)  = 1 a-  (f) 


a.,  ,(£)  a ff) 


N 

N 

^IS5 

ii 

H 

& 

E 

* 1 

X X = - 

mm  N 

£ 

m=  1 

m=  1 

mm 


(A 


Then  the  algorithm  continues  recursively  with  the  minimization  of  the  spatial 
prediction  error  power  density  spectrum 
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and  where 
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Aside  from  the  way  a^+  J is  computed,  the  rest  is  the  same  as  for  the  Burg 

M+  1 

technique  proper.  Normally  (but  not  always)  the  procedure  continues  until  the 
prediction  error  filter  length  is  the  same  as  the  number  of  sensors. 

The  absolute  value  of  a^+  J never  exceeds  one  since 

M+  1 


~m  r m \* 

^m+  M vm+  M/ 


(The  absolute  value  of  a crosspower  spectrum  never  exceeds  the  average 
value  of  the  two  corresponding  autopower  spectra). 

Because  the  prediction  error  filter  outputs  are  not  available,  the 
crosspower  spectrum  matrix  must  be  used  in  applying  the  Burg  technique 
to  wavenumber  spectra.  Th's  fact  makes  the  required  computations  more 
cumbersome,  especially  for  arrays  with  many  sensors.  If,  in  certain  situa- 
tions, some  loss  in  spectral  resolution  and  accuracy  can  be  tolerated,  the 
wavenumber  analogue  to  the  standard  autoregressive  spectrum  provides  a 
reasonable  alternative. 
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APPENDIX  A 

TEST  FOR  MINIMUM  PHASE  SAMPLE  POINT  FILTERS 
(by  John  Parker  Burg) 


In  this  appendix,  a simple  test  is  developed  to  determine  whether  a 
finite  length  sample  point  operator  is  minimum  phase.  The  development  of 
this  test,  which  is  known  as  Jury's  stability  test  (Jury,  1962)  in  the  theory  of 
sampled  data  control  systems,  is  contained  in  the  book  The  Geometry  of  the 
Zeros  by  M.  Marden  (Marden,  1949).  The  justification  for  presenting  this 
test  and  two  important  theorems  about  the  zeroes  of  a polynominal  in  condensed 
form  is  their  apparent  newness  to  geophysical  workers  in  the  field  of  digital 
filter  theory. 

A sample  point  operator  is  defined  here  to  be  minimum  phase  if  and 
only  if  both  the  filter  and  its  inverse  sample  point  filter  are 


1)  physically  realizable  (i.  e.  , their  z-tranforms  are  Taylor 

n 

series  in  z of  the  form  a + a,  z + . . . + a z + ...),  and 

o 1 n 

2)  stable  (i.  e.  , their  z-transform  series  converge  for  Izl  = 1 ). 


This  definition  is  equivalent  to  saying  that  the  z-transform  of  a mini- 
mum phase  sample  point  operator  is  both  analytic  and  non- zero  on  and  inside 
the  unit  circle.  Since  a finite  length,  physically  realizable,  sample  point 
operator  is  analytic  for  |z|<«  , the  test  for  minimum  phase  becomes  one  of 


determining  whether  all  the  zeroes  of  a polynominal,  a + a,  z + . . . + a.T  z 

o 1 N 

are  outside  the  unit  circle. 


N 


To  develop  the  test,  two  important  theorems  about  the  zeroes  of  a poly- 
nominal are  first  proved.  These  two  theorems  and  the  test  are  valid  for  poly- 
nomials with  complex  coefficients. 
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Principle  of  Argument  Theorem 


Let  C be  a simple  closed  Jordan  curve  in  the  complex  plane  and  let 
F(z)  be  a polynominal  in  z,  none  of  whose  zeroes  lie  o..  Let  A be  the 
total  phase  shift  in  F(z)  as  the  point  z traverses  C once  in  a counterclock- 
wise direction.  Then  the  number  of  zeroes  of  F(z)  inside  C,  counted  with 
their  multiplicities,  is  given  by  A/ 2 n. 

N 

Proof:  Factoring  F(z)  = a + a,  z + . . . + a,_zN  into  a^T  I I (z-  a 

o 1 N Nil  1 

i=  1 

we  see  that  each  root  which  lies  inside  C contributes  2 n to  the  total  phase 
shift  of  F(z),  but  that  each  root  outride  of  C has  zero  contribution. 

Rouche's  Theorem 

If  P(z)  and  Q(z)  are  two  polynominals  in  z for  which  |P(z)|  > |q(*)I  on 
a simple  closed  Jordan  curve  C,  then  the  polynomial  F(z)  = P(z)  + Q(z)  has 
the  same  number  of  zeroes  inside  C as  does  P(z). 

Proof:  We  should  first  note  that  since  |p(z)|  > |Q(z)jon  C,  P(z)  and  F(z) 
cannot  have  any  zeroes  on  C.  Writing  F(z)  = P(z)  [l  + Q(z)/P(z%,  the  total 
change  in  the  argument  of  F(z),  as  C is  traversed  once  in  a counterclockwise 
direction,  is  the  sum  of  the  total  phase  shift  of  P{z)  and  1 + Q(z)/P(z).  But 
since  |Q(z)U|P(z)|  for  z on  C,  the  real  part  of  1 + Q(z)/P(z)  is  always  posi- 
tive for  z on  C and  thus  has  a total  phase  shift  of  zero.  Therefore,  F(z)  and 
P(z)  have  the  same  total  pha3e  shift  and  thus,  from  the  Principle  of  Argument 
Theorem,  they  have  the  same  number  of  zeroes  inside  C. 

Minimum  Phase  Theorem 

A finite  length  sample  point  operator  is  minimum  phase  if  and  only  if 

its  z-transform  is  given  by  a Qfz),  where  Q..(z)  satisfies  the  recursive 

o In  N 

procedure 
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Qo{«)  = 


Qn('>  ■ Qn-1<‘>  + r„’”[Q„-l  t'‘T  (A‘  l> 


with  all  |rnj<  | 


Proof:  We  prove  that  a filter  generated  by  this  recursive  procedure 
will  be  minimum  phase  by  noting  that 


1) 


2) 


2 n 

The  Q (z)  are  of  the  form  1+  a.  z + a_z  + . . . + a z 
n i 6 


n 


and  thus  are  analytic  on  and  inside  the  unit  circle,  and 


Starting  with  Q (z)  = 1,  which  has  no  zeroes  on  or  inside 
o * - ' • 

the  unit  circle,  we  see  that  p (z)[ 


for  |z|  = 1 and  thus  jQ^  j(z)|>|r^zn 
the  unit  circle.  Therefore,  using  Rouche's  Theorem 


K-,  (■•'•)]■ 

[«.,  (■-  in 


repeatedly,  Q^(z)  will  have  no  zeroes  on  or  inside  the 

unit  circle  and  thus  a Q.,(z)  will  be  minimum  phase. 

o N 

That  any  finite  leqgth  minimum  phase  filter  can  be  obtained  from  (A.  1) 
is  proved  by  seeing  that  the  reverse  of  the  recursive  procedure  is  unique  and 
that  alljr^l  will  be  less  than  one.  Letting  F^(z)  be  the  z-transform  of  an  N+ 1 
point  minimum  phase  filter  after  normalizing  the  first  term  to  one,  we  can 
write 

. . 2 . N-l  . N 

F(z)  = l + b^z  + b^z  + ...  + b^  jZ  + b^z 


r.  2 N-l] 

s |1  -t-  c x z + c jjZ  + ...  + cN_L  z j 

„ [CN- 


+ b 


* 2 

-1  Z + CN-2  Z + 


* N-l 
+ z + z 


1 


(A.  2) 


where  the  c are  determined  by  the  equation 
8 
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(A.  3) 


Since  F (z)  is  minimum  phase,  all  of  its  roots,  a.  , are  greater  than 
N 1 

one  in  magnitude.  Therefore,  since 


|b^  |<  jand  (A.  3)  will  always  have  a unique  solution.  Furthermore,  from 
Rouche's  Theorem,  the  first  polynominal  on  the  right  hand  side  of  (A.  2)  will 
also  be  minimum  phase.  Thus,  the  recursive  procedure  of  (A.  1)  is  uniquely 
reversible  with  all  |r^j<l  fora  finite  length  minimum  phase  filter. 
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APPENDIX  B 


COMPARISONS  OF  THE  CHARACTERISTICS  OF  MAXIMUM  ENTROPY 
AND  DISCRETE  FOURIER  TRANSFORM  SPECTRAL  ESTIMATION 


In  this  appendix,  a brief  comparison  is  made  between  the  characteristics 
of  max. mum  entropy  and  discrete  Fourier  transform  spectral  analysis.  This 
appendix  is  taken  from  a contractor  report  (King,  Swindell,  and  O'Brien,  1974) 
and  was  written  by  William  H.  Swindell,  Jr.  Several  of  these  comparisons  were 
obtained  from  an  excellent  paper  by  Lacoss  (1971). 

1.  Estimation  of  Autocorrelation  Function 

ME  : Uses  known  lag  values  unmodified.  Unknown  lags 

are  estimated  in  an  optimum  manner. 

DFT  : Weights  all  lag  values  with  some  function  introducing 

spectral  window  effects  in  the  spectrum. 

2.  Spectral  Window  Effects 

ME  : No  spectral  window  effects  as  such  are  introduced 

since  the  autocorrelation  function  is  known  or  estimated 
for  all  lags.  However,  similar  but  greatly  reduced 
effects  occur  when  the  prediction  error  filter  does 
not  create  a perfectly  white  output. 

DFT  : Window  effects  are  always  present.  Spectral  estimates 

are  exactly  equal  to  the  convolution  of  the  maximum 
entropy  spectrum  with  the  frequency  window  corre- 
sponding to  the  particular  tapering  method  employed. 
These  estimates  tend  to  be  the  convolution  of  the  window 
function  and  the  true  spectrum. 
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3.  Estimate  of  Peak  Power  Density  of  a Pure  Tone  of  Power  P 

2 2 

ME  : Proportional  to  P N where  N is  the  number  of 

measured  autocorrelation  function  lag  values. 

DFT:  : Proportional  to  P but  subject  to  picket  fence  effect 

from  spectral  v'ndow. 

4.  Estimate  of  Bandwidth  of  a Pure  Tone 

2 

ME  : Proportional  to  1/(PN  ).  ' 

DFT  : Proportional  to  1/N. 

5.  Estimate  of  Spectral  Power  in  a Pure  Tone 

ME  : Proportional  to  P. 

DFT  : Proportional  to  P/N. 

6.  Estimate  of  Line  Frequency 

ME  : Difficult  to  define  but  can  be  estimated  very  closely. 

(See  Chen  and  Stegun,  1974,  and  Ulrych,  1972). 

DFT  : + 1/  (2N  At). 

7.  Spectral  Reliability 

ME  : Difficult  to  define.  Asymptotically,  for  data  with 

low  spectral  contrast,  the  degrees  of  freedom,  K,  is 
less  than  or  equal  to  L/N  where  L is  the  number 
of  data  points  in  sample  . 

DFT  : K=2L/N  for  Bartlett  window. 

8.  Linearity  of  Spectra 

ME  : Estimation  is  nonlinear.  The  spectrum  of  the  sum 

of  two  time  series  is  not  equal  to  the  sum  of  the 
spectra  of  the  individual  time  series. 


1 
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DFT  : Estimation  is  linear.  Superposition  of  spectra  is 

valid. 

9.  Resolution  of  Closely  Spaced  Spectral  Lines 

ME  : Resolving  power  is  data-dependent  and  difficult  to 

define  but  lines  can  be  separated  at  a frequency 

2 

increment  approximately  proportional  to  1/N  . 

DFT  : Lines  can  be  separated  at  a frequency  increment 

proportional  to  1/N. 

10.  Spectral  Line  Detectability 

The  maximum  entropy  spectrum  is  clearly  superior  to  the  discrete 
Fourier  transform  spectrum  for  weak  signals  in  short  data  samples  where 
the  discrete  Fourier  transform  processing  gain  is  low.  For  long  data  samples 
where  the  natural  line  width  is  greater  than  l/(NAt),  this  advantage  is  reduced. 
For  fixed  N,  taking  longer  data  gates  L increases  detectability  for  the  maxi- 
mum entropy  spectrum  because  of  better  autocorrelation  lag  value  estimates. 
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APPENDIX  C 

THE  USE  OF  THE  BURG  TECHNIQUE  IN 
FILTERING  SHORT  RECORDS 

This  appendix  summarizes  a method  for  filtering  short  data  records. 
The  algorithm,  described  in  a recent  paper  (Ulrych,  Smylie,  Jensen,  and 
Clarke,  1973),  applies  a prediction  filter  to  the  data  to  extend  the  original 
time  series,  Fourier  transforms  the  extended  time  series,  multiplies  the 
Fourier  transform  by  the  desired  filter  frequency  response,  and  obtains  the 
filtered  time  series  through  inverse  Fourier  transformation. 

Suppose  that  the  Burg  technique  has  been  used  to  design  an  N-point- 
long  prediction  error  filter  from  a short  time  series  x^  (t=  1,2,  ...  , T). 

If  the  maximum  entropy  assumption  is  valid,  the  autocorrelation  function  sat 
isfies  the  relationships  incorporated  in  the  following  matrix  equation: 
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The  optimum  (N-l) -point-long  forward  prediction  filter  with  prediction 
distance  M,  on  the  other  hand,  is  obtained  from  the  filter  design  equation 
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r (1) 


r<0) 


r*(N-2)  r*(N-3) 


r(N-3) 

*r(0) 


M 

N-l 
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r (M+l) 


r (M+N-2) 


From  the  first  of  these  two  matrix  equations,  it  is  clear  that 
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that  the  one-step  forward  prediction  filter  output  at  time  T+  1 is 

N-l 

£(T+1)  = a.+  1 x{T+  i-j). 

j=l 

For  2<_M<_N-1, 
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and  premultiplication  of  this  equation  by  the  inverse  of  the  (N-l)X(N-l)  auto- 
correlation matrix  R^  ^ yields  the  M-strp  prediction  filter 
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so  that  the  M-step  forward  prediction  filter  output  at  time  TiM  is 


M-l 

N-l 

x(T+M)  = a.+  1 x(T+M-j)  - 

]C  aj+ l *(T+M-i) 

j=l 

j=M 
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For  M^N 
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so  that  the  M-step  forward  prediction  filter  output  at  time  T+M  is 


N-l 

x(T+M)  = - Y.  a J x(T+  M-j). 

j=l 
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Thus  the  optimum  M-step  forward  prediction  filter  outputs  &(T+M) 

from  the  final  N-l  measured  points  can  be  formed  recursively  using  the  (N-l)  - 

point-long  prediction  filter  with  the  weights  (-a_,  -a,,  ...  , -a  ) obtained 

2 3 N 

from  the  Burg  technique.  First,  the  one-step  forward  prediction  filter  output 
is  formed  from  the  final  N-l  measured  points: 

N-l 

*( T+  1)  = - ^ a x (T+  1-j)  . 

j=l 

Later,  in  place  of  the  unmeasured  points  in  the  formula 

N-l 

x(T+M)  = - y a.+  1 x (T+  M-j)  , 

j=l 

the  prediction  filter  outputs  x(T+M-j)  are  substituted  for  the  points  x(T+M-j) 
where  no  measurements  are  available.  Similarly,  the  optimum  M-step  back- 
ward prediction  filter  outputs  x(l-M)  from  the  initial  N-l  measured  points  can 
also  be  formed  recursively  starting  with 

N-l 

s<0>  = - 2 a*+i  x<j) 

j=l 

and  continuing  with  the  appropriate  substitutions  in 

N-l 

x(l-M)  = - a*+i  x(j+ 1 -M)  (Mi  2). 

According  to  the  originators  of  this  met  jd,  predicting  the  time  series  to  a 
total  len^.h  four  or  five  times  the  original  length  produces  acceptable  filtering 
results. 
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Once  the  extended  time  series  has  been  created,  the  discrete  Fourier 
transform  of  the  extended  time  series  is  multiplied  by  the  desired  filter 
response  in  the  frequency  domain,  and  the  filtered  trace  is  obtained  from  the 
inverse  Fourier  transform. 

ror  a more  detailed  exposition  of  this  method  and  for  illustrative 
examj  i,  the  reader  is  referred  to  the  original  paper  (Ulrych,  Smylie, 
Jensen-  d Clarke,  1973). 
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OBTAINING  REFINED  ESTIMATES 
OF  SPECTRAL  PEAK  PARAMETERS 


In  the  maximum  entropy  spectrum,  the  power  in  the  band  immediately 
surrounding  a spectral  peak  is  a more  reliable  indication  of  line  strength 
than  the  maximum  power  density.  This  appendix,  written  by  William  H. 

Swindell,  Jr.  , and  based  on  John  Parker  Burg's  notes,  is  taken  from  a con- 
tractor report  (King,  Swindell,  and  O'Brien,  1974).  It  presents,  among 
other  things,  an  excellent  method  for  integrating  the  power  around  a spectral 
peak  in  the  maximum  entropy  spectrum. 

Pertinent  parameters  of  a spectral  peak  in  a power  density  spectrum 

are: 

• Center  frequency 

• Peak  spectral  density 

• Total  spectral  power 

• Bandwidth  at  -3  dB  points. 

Because  of  the  extreme  sharpness  of  some  spectral  peaks  in  a maxi- 
mum entropy  power  spectrum,  there  is  considerable  difficulty  in  finding  the 
value  of  the  peak  density,  the  total  power,  and  the  bandwidth  of  the  peak.  Unless 
the  frequency  response  of  the  prediction  error  is  measured  with  a sufficiently 
fine  frequency  increment,  highly  misleading  spectral  estimates  may  result. 

In  Figure  D-l,  an  example  is  shown  of  a maximum  entropy  power  spectrum 
which  is  evaluated  at  frequencies  separated  by  an  increment  Af  resulting  in 
power  density  estimates  designated  by  the  large  dots.  A dashed  line  connecting 
the  dots  is  the  line  which  would  be  seen  in  an  ordinary  plot.  A peak  is  indicated 
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at  P with  the  power  densities  of  P and  P on  each  side.  The  true 

£ 1 J 

response  of  the  filter  is  shown  by  the  continuous  line  at  a true  center 
frequency  of  f It  is  obvious  from  the  figure  that  the  indicated  peak 

density  P is  almost  10  dB  smaller  than  the  true  peak  and  that  the  center 
frequency  is  actually  closer  to  f^  than  f^  . 

A means  of  obtaining  better  estimates  of  peak  density,  etc.,  is  outlined 
below.  The  technique  is  based  on  the  fitting  of  a curve  representing  the  response 
of  a resonant  circuit  to  the  spectral  estimates  P^  , P^  , and  P^  . The  complex 
response  of  a maximum  entropy  filter  near  the  region  of  small  filter  response 
(i.  e.  , a spectral  peak)  is  shown  in  Figure  D-2.  If  &f  is  much  less  than  1/T, 
where  T is  the  length  of  the  filter,  the  complex  response  in  the  narrow  fre- 
quency band  near  a point  of  minimum  response  can  be  approximated  by  a straight 
line.  Then  from  solid  geometry: 

P (f)  = d2  + m2  (f  - f ) 2. 
r o 


Thus  an  approximation  to  the  power  spectrum  near  f is  proportional  to 

o 

1 1 


P(f)  - 


Pr(£>  ' d2+m2(f-f) 


Defining: 

2 

a = 1/d  = peak  value 

b = 2d  / m = 2/m  4^  - bandwidth 


then  we  wish  to  fit  the  following  curve  to  the  spectrum 


P(f) 


a 

1 + 4/b2  (f  - f )2 
o 


(D-l) 
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FIGURE  D-2 

COMPLEX  FREQUENCY  RESPONSE  OF  THE  PREDICTION 
ERROR  FILTER  NEAR  A MINIMUM 
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where: 


P(f)  = a for  f = f 

- a/ 2 for  f=  f i b/2 
o 


and  the  total  power, 


T , 
P 


is: 


T 

P 


P<f)  df 


7rab 

2 


(I 


Substituting  the  three  values  of  power  density  about  a peak 

(P  (f  ),  P (f  ),  P (f  ))  into  equation  (D-l),  the  parameters  a,  b,  ana  f 
1 1 2 2 3 3 


can  be  solved  for: 


where 


f = f,  + QAf 
o 2 


= P. 


1 + 


2P1  2 
V -d-Q)  j 


(1 

0 


= 2 Af 


2P, 


d-Q) 


1/2 


R = P,  + P,  _ 


2P  P 
13 


Occasionally,  when  the  frequency  interval  Af  is  too  large,  the 
2 

quantity  (1  - Q)  exceeds  2P^/R  in  equation  (D-5)  causing  imaginary 
solutions.  A real  solution  can  always  be  obtained,  however,  if  Af  is  suffic- 
iently small.  In  that  event,  the  spectrum  is  interpolated  at  the  midpoints 
between  P and  P and  P . This  results  in  five  power  density  estimates 
at  Af/2.  The  new  peak  density  and  its  adjacent  values  are  then  used  to  obtain 
a solution.  This  interpolation  procedure  may  be  repeated  as  often  as  necessary. 
It  is  advantageous,  however,  to  use  a rather  small  Af  to  start  with  so  that  the 
spectral  density  plots  will  give  a fairly  accurate  picture  of  the  true  spectrum. 
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APPENDIX  E 

ADAPTIVE  IMPLEMENTATION  OF  THE  MAXIMUM  ENTROPY  SPECTRUM 


The  basic  idea  behind  this  technique  is  to  smooth  exponentially  the  pre- 
diction error  filter  output  products  used  to  estimate  P and  the  ladder  coef- 

J+l  1 

ficients  aj+j  (J  = 1,  2,  . ..,  N-l)  in  the  Burg  technique  (see  Section  III). 

Older  prediction  error  filter  output  points  are  weighted  slightly  less  than  their 

immediate  successors: 
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where  0 < k < 1 . 


The  new  prediction  error  filter  outputs  needed  at  time  t = TAt  are 
computed  from  the  prediction  error  filter  ladder  coefficients  obtained  at 
t = (T-l)At.  However,  previously  computed  prediction  error  filter  outputs 
used  in  forming  the  new  prediction  error  filter  outputs  are  not  recomputed  as 
the  prediction  error  filter  is  updated.  In  th‘  /ay,  the  number  of  arithmetic 
operations  is  made  proportional  to  the  filter  length  in  points  times  the  number 


of  time  series  input  points.  If  the  ladder  coefficients  change  slowly,  i.  e. , if 
1-k  < < 1,  this  procedure  will  have  only  a minor  effect  on  the  prediction  error 
filter  outputs  and  ladder  coefficients. 

The  numerator  and  denominator  terms  Uj(T)  and  Vj(T)  (J  = 1,2,... 

, N)  are  recursively  updated  as  each  new  time  series  value  becomes 
available,  and  the  one-point  prediction  error  filter  output  power  and  the  ladder 
coefficients  (J  = 1,2,...,  N-l)  are  computed  in  the  following  manner: 


P j(T)  = 


kU.  (T-l)  + xx* 
1 * 1 T T 

kVj(T-l)  + 1 


J+l  x 
aT  , i (T) 


kUJtl(T..l)-  2(pj./4 

kvJ+i<T-‘'+  k-X./+  4<>*. 


Initially,  the  numerator  and  denominator  terms  U (T)  and  V (T)  as 

J+l  J J 

well  as  the  ladder  coefficients  a^^T)  are  zero  at  T = 0.  During  initializa- 
tion (T  = 1,2,...,N),  only  enough  points  of  the  time  series  are  available 
to  compute  the  prediction  error  filter  outputs  for  the  T-point-long  prediction 
error  filter,  so  that  the  T-th  ladder  coefficient  cannot  be  updated  until  x^ 
is  available. 


Figure  E-l  is  a flow  chart  of  the  procedure  for  recursively  updating 

Pj  and  the  ladder  coefficients  a^*!  . Before  the  update  at  time  t = TAt,  the 

J N N N-l  N-2 

processed  time  series  is  of  the  form  . . . , PT_N_1»  P^.n*  PT-N+1  ’ PT-N+2  ’ 

3 2 1 

....  p,j,  3 , p^,  ^ , Pj_i  = x-p  j • x<p  • After  the  update,  the  processed  time 

. - , N N N N-l  4 

senes  is  o{  the  form  p”_n_,  . PT.N . PT.N+1  . PT.N+2 PT.j. 

3 2 1 

PT  2 ’ PT  1 ’ PT  = XT  ’ °*  t*le  backward  prediction  error  filter  output 
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PROCEDURE  FOR  RECURSIVE  UPDATE  OF  DATA  POWER  AND 
PREDICTION  ERROR  FILTER  LADDER  COEFFICIENTS 


points  needed  for  the  next  update  overlay  the  corresponding  points  of  the  input 
time  series.  The  successive  forward  prediction  filter  outputs  q^,  (J  = 1,2, 

. . . , N-l)  at  time  t = TAt  are  computed  as  needed  in  the  ladder  coefficient 

j 

update.  Note  that  the  backward  prediction  error  filter  outputs  p^,  and  the 
forward  prediction  error  filter  outputs  q^,  in  the  ladder  coefficient  update  are 
both  computed  from  the  prediction  error  filter  obtained  during  the  update  at 
time  t-  (T-l)At  . The  recursive  procedure  described  here  has  been  outlined 
previously  in  general  terms  (Riley  and  Burg,  1972)  but  not  explicitly  stated 
and  differs  from  the  technique  actually  implemented  by  Burg  (Burg,  1974). 


To  compute  a maximum  entropy  spectrum,  the  Levinson  recursion 


relations 


and 


provide  the  N-point-long  prediction  error  power  F and  the  prediction  error 

N N N N N 

filter  (1,  a.,  , , . . . , a^  ^ , a^)  needed  in  the  power  density  spectrum 

formula 


P At 
N 

N-l 

, , N i2ff fJAt 


P(f) 


2 


The  power  density  spectrum  requires  considerably  more  computational  effort 
than  a single  update  of  and  the  ladder  coefficients.  If  desired,  the  power 

density  spectrum  can  be  computed  only  at  specified  time  intervals  instead  of 
after  every  update  in  order  to  reduce  the  computational  load. 

If  k = t/(t~  1),  k"m=  e when  m = 1/ln  [ (t  + 1)/t  ] or  approximately 
r + 0.  5 when  r is  large,  so  that  the  time  constant  for  the  adaptive  maximum 
entropy  spectrum  is  approximately  (t+  0.  5)At  . After  M updates,  the  num- 
ber of  degrees  of  freedom  in  estimating  the  power  is 


m~-l 


As  M becomes  large,  this  value  approaches  1/(1  -k)  or  t + 1 if  k = t/(t+1). 
The  effective  time  delay  is 
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12  (n.-l)kM-‘ 

m=l  
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- k 
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after  M updates  and  approaches  -kAt/(l*k)  or  -r  \t  as  M becomes 
large. 

To  test  the  adaptive  maximum  entropy  spectral  analysis  technique  de- 
scribed here,  a 1000-point-long  chirp  waveform 


xT  = x(TAt)  = 


(T-l)2 

2TMAX 


(T  = 1,2 TMAX) 


was  processed.  The  instantaneous  frequency  of  this  waveform  is 


E-5 


f(TAt) 


T-l 


2TMAXAt 

The  length  of  the  prediction  error  filter  was  fifty  points.  The  time  constant 
used  in  the  recursive  adaptive  update  was  100 At,  or  twice  the  prediction  error 
filter  length.  After  every  ten  points,  the  maximum  entropy  spectrum  corres- 
ponding to  the  current  prediction  error  filter  and  the  current  prediction  error 
power  was  computed  over  the  frequency  band  0 to  0.5/At  at  a frequency  in- 
crement of  0.0005/At.  Figure  E-2  displays  the  resulting  adaptive  power  den- 
sity spectra  in  logarithmic  form.  As  time  increases,  corresponding  spectral 
levels  are  plotted  at  a higher  level  in  the  figure.  After  a brief  warmup  period, 
the  principal  spectral  peak  closely  follows  the  instantaneous  frequency  of  the 
input  time  series.  Because  of  the  exponential  smoothing  applied  to  the  input 
data,  the  trend  of  the  spectrum  is  a linear  increase  (on  a logarithmic  scale)  as 
the  frequency  rises  to  the  frequency  of  the  principal  spectral  peak  and  then  a 
sharp  dropoff  out  to  the  folding  frequency.  Figure  E-3  illustrates  the  ability 
of  the  adaptive  maximum  entropy  spectrum  to  track  the  instantaneous  frequency 
of  the  chirp  waveform.  In  this  figure,  the  frequency  of  the  maximum  spectral 
intensity  is  plotted  as  a function  of  time.  Except  for  a Bmall  number  of  points, 
the  resulting  path  is  very  close  to  linear.  The  time  lag  of  the  spectral  peak 
relative  to  the  instantaneous  frequency  is  about  ten  points,  or  only  one  tenth  of 
the  time  constant. 
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FIGURE  E-2 

ADAPTIVE  MAXIMUM  ENTROPY  SPECTRUM  OF  CHIRP  WAVEFORM 
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